{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Text embedding models \n",
    "文本嵌入模型\n",
    "\n",
    "> The Embeddings class is a class designed for interfacing with text embedding models. There are lots of embedding model providers (OpenAI, Cohere, Hugging Face, etc) - this class is designed to provide a standard interface for all of them.<br>\n",
    "Embeddings 类是专为与文本嵌入模型交互而设计的类。有很多嵌入模型提供者（OpenAI、Cohere、Hugging Face 等）——这个类旨在为所有这些提供一个标准接口。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The base Embeddings class in LangChain provides two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).\n",
    "\n",
    "LangChain 中的基本 Embeddings 类提供了两种方法：一种用于嵌入文档，另一种用于嵌入查询。前者采用多个文本作为输入，而后者采用单个文本。将这些方法作为两种独立方法的原因是，某些嵌入提供程序对文档（要搜索）与查询（搜索查询本身）具有不同的嵌入方法。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain_openai import OpenAIEmbeddings\n",
    "\n",
    "embeddings_model = OpenAIEmbeddings()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## embed_documents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(5, 1536)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "embeddings = embeddings_model.embed_documents(\n",
    "    [\n",
    "        \"Hi there!\",\n",
    "        \"Oh, hello!\",\n",
    "        \"What's your name?\",\n",
    "        \"My friends call me World\",\n",
    "        \"Hello World!\"\n",
    "    ]\n",
    ")\n",
    "len(embeddings), len(embeddings[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## embed_query"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0.005377274017660775,\n",
       " -0.000652777962841794,\n",
       " 0.0389802907525645,\n",
       " -0.0029673974856661723,\n",
       " -0.008834563850488029]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "embedded_query = embeddings_model.embed_query(\"What was the name mentioned in the conversation?\")\n",
    "embedded_query[:5]"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langchain0_1",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
